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(57) Abstract 

A method and system for electronic data— gathering 
system allows a user to easily capture and archive elec- 
tronic data without the need to interact with an additional 
application user-interface. The present invention stream- 
lines the workflow of performing research, and also ensures 
that information is easily traceable to its original source. 
The described embodiments of the present invention auto- 
matically encapsulate user-selected sets of electronic data 
with a set of attribution, creation, and user-defined meta- 
data. The system uses the captured data and metadata 
to create gem data objects. These gem data objects are 
then routed within the electronic data-gathering system. 
Hie gem data objects may be stored on a persistent stor- 
age mechanism. The research system also includes a data 
viewer that allows a user to view and perform actions upon 
the gem data objects. 
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A Method and System for Performing Electronic Data-Gathering 

Across Multiple Data Sources 

Inventors 
Ashwin Gulati 
5 William J. Blackburn 

Cross Reference to Related Application 

This application claims the benefit of U.S. Provisional Patent Application Serial No. 
60/1 14,065, "Virtual Gem", by Ashwin Gulati, filed December 28, 1998, and U.S. Provisional 
Patent Application Serial No. 60/160,639, "Method And System For Performing Electronic 
W Data-Gathering Across Multiple Data Sources", by Ashwin Gulati and William J. Blackburn, 
filed October 20, 1999, which subject matter is incorporated herein by reference. 

Background 

A. Technical Field 

The present invention relates generally to electronic data-gathering, and more 
15 particularly, to the automatic capture, storage and classification of selected portions of 
electronic data. 

B. Background of the Invention 

Researching a topic using digital sources has become a time intensive process due to 
the vast quantity of data available. In addition, a researcher needs to track and attribute all 
20 such data used to its original source. Digital information may be collected from such varied 
sources as text and image files, network sources, and the Internet The current process of 
collecting such data involves a complex series of interactions with the source data itself, in 
addition to computer-based or manual filing systems, word processors, and other authoring 
tools and applications that make the process non-linear and difficult to manage. 

25 Researchers using electronic media as sources for information and documentation can 

archive research data using file systems or indexing software solutions. The traditional 
method for storing a copy of information found during research is to save the entire document 
into such a file system for later retrieval, or alternatively to save a relevant section of a 
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document in a word processing or authoring type of computer application. These types of 
traditional computer applications for research storage include an underlying model describing 
the nature of the persistent data used and the functions that are available to operate on that 
data. The computer system will implement one or more views, typified in the modern GUI 
interface, allowing the user to interact with the model. 

For example, a user might want to save a selected portion of information from a web 
page viewed using a web browser. The user could save the complete web page; however, this 
would not only save the relevant information but would store the entire web page. Storing the 
entire web page would distract future users from focusing upon the relevant information 
within the web page. Such storing of extraneous information is undesirable in a research 
environment. 

The user could alternatively copy the relevant section of information from the web 
browser view. Next, the user would open up another application view, such as a word 
processing application, and paste the document segment into the application. The user would 
then have to save the new document as a new file. The user would also have to manually 
input creation and attribution information about the document into either the document itself 
or the file name of the document, so that the user could properly identify and attribute the 
information later. As used herein, creation information refers to information about the actions 
relating to saving the original data, such as the identity of the system user, the date and time of 
storage, and the source document from which the data was taken. Attribution information 
refers to bibliographic information such as the original author, date of publication, etc. 

Currently, multiple pieces of stored electronic data are most often viewed using the 
application with which they were stored. A user might store entire web pages, portions of text, 
spreadsheet information, and graphics. Each of these types of documents could potentially be 
5 stored in a different application format. When each piece of data is later retrieved, it is 
necessary to open multiple different applications for data viewing, adding complexity to the 
process. 

Traditional approaches to electronic data-gathering interrupt the workflow of a 
research project. For example, to save a current view, a researcher must stop interacting with 
30 the current view of the program he is using to perform his research; interact with a view of the 
file system browser, indexing application, or word processing application required to view 
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stored research information; and then return to the view he was accessing before. The 
preferred process would be to organize such research efforts into a stream where all relevant 
research materials are collected from source documents at the beginning of the project. These 
collected materials would be processed to automatically retain creation and attribution 
information about the materials, and then stored for future review. Furthermore, the data 
stored in this phase would be visible to other researchers working on the same, or even non- 
related projects. Finally, the data could be withdrawn as needed, along with all proper creation 
and attribution information, during the compilation of the final output document. 

No conventional system allows for the automatic capture, storage, and classification of 
less than an entire source document, including the acquisition and storage of creation and 
attribution metadata, packaged into a single, routable self-attributing format. Thus, there is a 
need for a system that will allow a researcher to acquire and use important pieces of data 
gleaned from electronic files of various types without stopping his or her work to interact with 
an additional application user-interface. Such a system would provide a conduit for inserting 
research data into a system model directly while negating or delaying the need to interact with 
a view of the system. 

Summary of the Invention 

The electronic data-gathering system of the present invention allows a user to easily 
capture and archive electronic data without the need to interact with an additional application 
user-interface. This streamlines the workflow of performing research, and also ensures that 
information is easily traceable to its original source. 

The described embodiments of the present invention automatically encapsulate user- 
selected sets of electronic data with a set of attribution, creation, and user-defined metadata. 
The system uses the captured data and metadata to create gem data objects. These gem data 
objects are then routed within the electronic data-gathering system. The gem data objects may 
be stored on a persistent storage mechanism, and viewed or edited by a system user. 

In one embodiment of the present invention, a user captures a set of electronic data via 
the system clipboard, drag and drop activity, manual type-in or a scanner. The data is input 
into a data target, which does not necessitate opening an additional program view. However, 
the user may optionally choose to input data using a system data viewer. The research system 
automatically captures a set of metadata and creates an interim object by appending the set of 
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metadata to the original captured data. This interim object is then passed to the main routine, 
where a gem data object is created. 

The research system also includes a data viewer that allows a user to perform actions 
upon the gem data objects. A user may perform a search for gem data objects using keywords 
5 or other attributes of gem data objects. A user may view the gem object database in a 
hierarchical manner, and create a storage hierarchy by placing gem data objects in various 
containers within the database. Additionally, a user may edit gem data objects using the data 
viewer. 

Advantages of the invention will be set forth in part in the description which follows 
w and in part will be obvious from the description or may be learned by practice of the invention. 
The objects and advantages of the invention will be realized and attained by means of the 
elements and combinations particularly pointed out in the appended claims and equivalents. 

Brief Description of the Drawings 

Fig. 1 is an example of an electronic data-gathering system in accordance with the 
15 present invention. 

Fig. 2 is a flowchart of a research workflow method of the present invention. 

Fig. 3 is a flowchart showing an example of a gathering phase for a research workflow 
method. 

Fig. 4A is an illustration of an acquire data step using a data target as used in an 
20 embodiment of the present invention. 

Fig. 4B is an illustration of an acquire data step using a gem data object graphical user 
interface view as used in an embodiment of the present invention. 

Fig. 5 is flowchart showing an example of a method for processing data into a gem 
data object. 

25 Fig. 6 is a flowchart of an example of a method for creating an interim object. 

Fig. 7 is a flowchart of an example of a method for creating a gem data object. 
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Fig. 8 is a diagram showing an example of a user interface window for a gem data 
object viewer with the "General" tab of the popup menu of a gem data object expanded. 

Fig. 9A is a diagram showing an example of a user interface window for a gem data 
object viewer with the "Bibliography" tab of the popup menu of a gem data object expanded. 

5 Fig. 9B is a diagram showing an example of the "Bibliography" tab of the popup menu 

of a gem data object expanded, with a scrolling menu displayed for the "Bibliography Format" 
option. 

Fig. 10 is a diagram showing an example of a user interface window for a gem data 
object viewer with the "View" tab of the popup menu of a gem data object expanded. 

10 Fig. 11 is a flowchart showing an example of a storage phase for a research workflow 

method. 

Fig. 12 is a flowchart showing an example of a performing data actions phase for a 
research workflow method. 

Fig. 13 is a flowchart showing an example of a data analysis process. 

25 Fig. 14 is an example of a root node as used in an embodiment of the present invention. 

Fig. 15 is an example of a container node as used in an embodiment of the present 
invention. 

Fig. 16 is an example of a container node as used in an embodiment of the present 
' invention. > .. 

20 Fig. 17 is an example of a gem data object as used in an embodiment of the present 

invention. 

Fig. 1 8 is a diagram showing an example of a gem data object viewer with a "View" 
popup menu expanded to show a text type gem data object. 

Fig. 19 is a diagram showing an example of a gem data object viewer with a "View" 
25 popup menu expanded to show a text type gem data object. 
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Fig. 20 is a diagram showing an example of a gem data object viewer with a "View" 
popup menu expanded to show a file type gem data object. 

Fig. 21 is a diagram showing an example of a gem data object viewer with a "View" 
popup menu expanded to show a graphics type gem data object. 

Fig. 22 is a diagram showing an example of a gem data object viewer with a "View" 
popup menu expanded to show a Uniform Resource Locator (URL) type gem data object. 
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Detailed Description of Embodiments 

Reference will now be made in detail to several embodiments of the present invention, 
examples of which are illustrated in the accompanying drawings. Wherever practicable, the 
same reference numbers will be used throughout the drawings to refer to the same or like 
5 parts. 

A. System Overview 

Fig. 1 is an illustration of an embodiment of an electronic data-gathering system for 
use in research operations. System 100 is used for creating, viewing, storing and using 
encapsulated packages of electronic data and metadata. These encapsulated packages of data 
10 and metadata are referred to herein as gem data objects. It should be understood that the term 
"gem data object" refers not only to a Gem from the eGems ™ software by Gemteq Software, 
Inc. but also to other types of objects. Different formats for creating, storing, and using gem 
data objects will be evident to one of skill in the art. 

The system of Fig. 1 is suitable for use with both data acquisition and data retrieval. 
15 First, the data acquisition functions will be discussed, followed by the data retrieval functions. 

The data acquisition functions of the electronic data-gathering system of Fig. 1 will be 
presented by broadly tracing the path of a set of data that a user wishes to enter into the 
system. First, electronic data must have a point of entry into a system 100. The system 100 
contains a data target 10 through which electronic data may be captured by the system 100. 
20 Alternatively, electronic data may be captured using a data scanner 20. The captured data is 
then passed to an analyzer/conduit 30, which receives the captured data, captures additional 
metadata, and passes the data and metadata to a client module 40. The metadata captured by 
the analyzer/conduit 30 may be any data describing or referring to the original data. 

The client module 40 now contains both the original data and the captured metadata. 
25 Within the client module 40, different processing modules operate on the data and metadata to 
create a gem data object, an encapsulated package of data and metadata, capable of being 
routed throughout the system 100. Specialized modules within the client module 40 operate 
on the gem data object and further enhance data acquisition, storage and usage. A 
plain/formatted text module 44, a graphics module 46, and an import/export module 56 make 
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up the core functionality for acquiring and using basic image and text data, as well as 
packaging that data for transfer between databases. 

The text module 44 is responsible for processing acquired text data sets. Text data sets 
may be defined using a system clipboard, drag and drop activity, manual typing-in of data, 
speech recognition of audio data, or scanning-in of data combined with optical character 
recognition (OCR). Processing text data includes recognizing the type of text, displaying it in 
a visual component, and performing editing operations upon it. 

The graphics module 46 is responsible for processing acquired image data sets. Image 
data sets may be defined using a system clipboard, drag and drop activity, or scanning-in of 
data. Processing includes recognizing the type of image, displaying it in a visual component, 
and performing editing operations upon it. 

The import/export module 56 handles the transfer of data between databases. In one 
embodiment, this module resides in the client module 40, but it may also reside in the server 
module 70. The import/export module 56 provides for the transfer of all or part of the current 
system database into a new or existing database or export file, as well as the transfer of some 
or all of a database or import file into the current system database. 

A language module 42 is capable of performing in-line language translation upon the 
captured data to convert from text in one language to text in another language. A speech-to- 
text module 48 allows for the translation of text data into audio data and vice versa. A palm- 
top device synchronizer module 50 allows for the acquisition of data from palm-top devices 
during routine synchronization procedures, which such palm-top devices are capable of 
performing in conjunction with another computer. An audio/visual module 52 allows for the 
capture, storage and retransmission of audio and video data. An optical character recognition 
(OCR) module 54 allows for the conversion of scanned text into editable text data. The 
specialized client modules shown in Fig. 1 are merely representative. Additional modules to 
perform specialized manipulations for gem data objects may be added to the client module 40, 
as will be evident to one of skill in the art. 

A data viewer 62 allows the user to interact with the system 100 to manipulate raw 
electronic data and gem data objects in several ways. Such user interactions include, but are 
not limited to: capturing original data; gathering, refining, and editing gem data objects; 
classifying and storing gem data objects; and retrieving and using gem data objects. In one 
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embodiment, the data viewer 62 provides a graphical user interface that displays all gem data 
object database information in a hierarchical format. 

A data object encoder/decoder 58 packs information from the client module 40 for 
transporting to a server module 70. During data acquisition, the data object encoder/decoder 

5 58 will package the newly created gem data object into a format suitable for transport. The 
data object encoder/decoder 58 also unpacks information received back from the server 
module 70. A system clipboard data manager 60 allows a user to cut and paste data. The 
clipboard data manager 60 handles interpreting a selected gem data object during copy and 
paste or drag and drop actions, and makes the selected data available to the requesting 

10 application. 

After the creation and encoding of a gem data object within the client module 40, 
encoded gem data object 64 is sent to a data transmission module 66 for transport to the server 
module 70. The specific implementation of the data transmission module 66 will vary based 
upon the type of client/server architecture within the system 100. For example, for a local, 
25 single user installation, no transmission protocol is required, as data transmission may be done 
using standard in-process communication mechanisms. For a remote server module, some 
means of requesting data, or operations on data, and for receiving a response is required. The 
data transmission module 66 can therefore use any number of different data transmission 
protocols, which are well known in the art. 

20 The server module 70 receives the encoded gem data objects from the data 

transmission module 66. The server module 70 encapsulates all data access to an underlying 
persistent storage mechanism 80. The server module 70 can run in or out of process with the 
client module 40, and can run either locally, as would be the case with a single user version of 
the system, or remotely, as in a client-server implementation or over the Internet, via standard 

25 protocols. 

A data object encoder/decoder 72 unpacks the transmission, in this example a newly 
created gem data object, received from the client module 40. The server module 70 passes the 
transmission to a data object router 74, which processes client requests and routes data to the 
appropriate gem object database location. The data object router 74 may route data based 
30 upon routing information received with a transmission, or may route according to pre-set rules. 
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The server module 70 interacts with the persistent storage 80 to persist the data. The 
newly created gem data object will thus be stored in the persistent storage 80 in a gem object 
database. Various different embodiments of a gem object database will be evident to one of 
skill in the art. 

The system 100 of Fig. 1 may also be used for data retrieval. Data retrieval may 
include, but is not limited to, retrieving individual gem data objects and performing queries for 
information upon the gem object database. 

When using the system 100 for data retrieval, a user will input requests for information 
into the data viewer 62. The data viewer 62 provides a graphical user interface for viewing the 
contents of the gem object database and performing searches of the gem object database. In 
addition, the data viewer 62 provides standard graphical means (such as drag and drop or copy 
and paste) for organizing data via the movement, addition, and deletion of categories and gem 
data objects. 

The client module 40 receives the request from the data viewer 62. The data object 
encoder/decoder 58 packages the request for transmission to the server module 70. The 
encoded request is transmitted to the server module 70 via the data transmission module 66. 
The server module 70 receives the request from the data transmission module 66, decodes it 
using the data object encoder/decoder 72, and submits it to a query processor 76. 

The query processor 76 forms database queries based upon requests received from the 
client module 40 or generated locally within the server module 70. These queries are used to 
search the gem object database and respond to requests for database searches. The response is 
then encoded via the data object encoder/decoder 72 and retransmitted back to the client 
module 40. The client module 40 will display the request results to the user via the data 
viewer 62. 

The query processor 76 and a data analysis processor 78 perform additional server 
module 70 tasks. The data analysis processor 78 analyzes the gem object database stored 
within the system 100 to create general profiles of system 100 users, and to create specific 
profiles of topics of interest within the gem object database. The data derived from these 
profiles may then be formed into queries. These queries may be submitted to Internet search 
engines as well as to the query processor 76 to provide a user with Internet sites or local files 
of potential interest. 

10 
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The various modules disclosed with respect to Fig. 1 are merely representative. The 
functionality of the various different modules may be combined into a non-modular electronic 
data -gathering system without departing from the inventive concepts disclosed herein. 
Various different modules may also be combined into a single module as will be evident to 
5 one of skill in the art. 

The embodiment disclosed above is implemented in software. The software is 
embodied in a computer readable medium, suitable for use with a computer system. Such a 
computer system may comprise a plurality of processors combined with a data viewscreen; 
however, other embodiments will be evident to one of skill in the art. 

w Fig. 2 shows a flowchart of a research workflow method of the present invention. The 

method of Fig. 2 is used with the electronic data-gathering system 100 of Fig. 1 to perform 
research operations. During a gathering phase 210, important and relevant information is 
chosen and collected. Next, the information is stored in a storage phase 220. Finally, the 
information may be retrieved, used, and manipulated in a performing data actions phase 230. 

15 Although the process shown here is a linear one, the steps can be performed in any order. For 
example, a user could perform data actions upon the data before storage. 

B. Gathering Phase 

Fig. 3 shows further details of an embodiment of the gathering phase of the research 
workflow method for use with an electronic data-gathering system. Data is first acquired in 
20 step 310, and then data is processed into a gem data object in step 320. 

Figs. 4A and 4B show two different embodiments of the acquire data step. Four 
different methods of selecting data are shown: interaction with the system clipboard 410; drag 
and drop activity 420; manual typing-in of data 430; and electronic scanning-in of data 440. A 
system user may capture data electronically using any of these different methods. A user may 
25 also alternate between the different methods for selecting data as convenience dictates. 

A user may capture any kind of electronic data in the acquire data step. The list of 
potential types of data that may be selected includes, but is not limited to: text type data, 
including Rich Text Format (RTF), American Standard Code for Information Interchange 
(ASCII) and various word-processing native formats; graphical files including Joint 
30 Photographic Expert Group (JPEG) and Graphics Interchange Format (GIF) type files; URL 
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links; World Wide Web (WWW) pages, including Hyper Text Markup Language (HTML) 
files; File links; and Object Linking and Embedding (OLE) type object links. 

Different types of gem data objects may be created from the data types listed above. In 
addition, gem data objects may be created from other types of data as will be evident to one of 
5 skill in the art. Support for additional gem data object formats may be included by coding and 
installing the appropriate module. There is no limit on the type and number of potentially 
supported formats except for the disk space and memory of the hardware platform chosen by 
the user. 

In Fig. 4 A, these four different methods of selecting data 410, 420, 430 and 440 are 
10 used to input data into a data target 400. The data target 400 is a free-floating, movable icon 
symbolizing the active research system. The data target 400 eliminates the need for a user to 
open and interact with additional application windows, while still allowing the user to access 
key features of the system. As shown in Fig. 4A, electronic data may be input directly into the 
data target without the need for an additional view window. In this case, the data being input 
15 is either pasted or dragged onto the data target 400. The data target 400 provides a target for 
inputting data into the research system, and also provides a visual indication that the electronic 
data-gathering system is running. The user can add data to any level of the tree structure of 
the research system. If data is added to data target 400, the data is placed at a default location 
in the tree. 

20 In Fig. 4B, the aforementioned methods of selecting data 410, 420, 430 and 440 are 

used to input data into a data viewer window 450. In this embodiment, the data target 400 has 
been expanded to allow a view into the research system. The data viewer window 450 
provides a user with an expanded view of the research system, while still allowing a user to 
work concurrently with other program applications. 

25 Fig. 5 shows an embodiment of the method for processing the selected raw data into a 

gem data object. In one embodiment, the steps of acquiring metadata 510 and creating an 
interim object 520 are performed by the analyzer/conduit 30. However, other embodiments 
will be evident to one of skill in the art. For example, the metadata could be passed directly to 
the main routine without creating an interim object. The main processing thread of either the 

30 client module 40 or the server module 70 is referred to herein as the main routine. 
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First, metadata is acquired in step 510. This metadata is information about the original 
data obtained in the gathering phase. Such metadata may include, but is not limited to, 
information about the source of the original data, such as: a Web page address where the 
original data was found; the filename of the source of the data; or the date the data was copied. 
5 The metadata may also include attribution data required to format a proper bibliography, such 
as: the original author; date of publication; or specific location or page number where the 
original data was located within the source document 

The metadata is used to create an interim object in step 520. This interim object is then 
passed to the main routine via a capture event 530. The main routine is then used in step 540 
70 to create a gem data object from the interim object. 

Fig. 6 shows an embodiment of the method of operation for the analyzer/conduit 30 to 
create an interim object. The steps shown in Fig. 6 encompass steps 510 and 520 from Fig. 5. 
Steps 610-660 all relate to acquiring metadata and may be performed in any order. Not all 
steps will be performed in all embodiments. 

25 In step 610, user data is added to the set of metadata being collected. The operating 

system (OS) Application Program Interface (API) is used to acquire the user data, which may 
include: information about the system user who is collecting data; the name or designation of 
the machine being used; and the system date and time. In step 620, source data is added to the 
metadata. The OS API is used to determine the source application for the original data as well 

20 as the designation of fee document providing the original data. For example, if the original 
data was a web page, the source application might be a web browser while the document 
designation would be a URL. However, in certain situations this information may not be 
available. In such cases, a best-guess routine is used which attempts to determine the 
application name and document name providing the data. The routine makes a guess based 

25 upon the application window that was last active before the user input data into the electronic 
data-gathering system. 

Bibliographic data is added in step 630. The captured data is scanned to determine 
such information as the author, the publication date, the original document title, and other 
attribution data. When this data is not available, a best-guess routine is used which suggests a 
30 value from the captured data, for example "meta" tags in HTML files, or embedded tags in 
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RTF files. Optionally, the original source data may also be scanned to determine a best guess 
for this information. If no data is available to make a best guess, the field is left blank. 

A default name and description is added in step 640. The captured data is scanned to 
suggest a default name for the gem data object that will be created. This name is typically 
derived from the first 3-4 words of captured text, but other implementations will be evident to 
of skill in the art. The description will default to "<data type> from <source documents" 



one 



o 
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20 



25 



A keyword scan is performed in step 650. Any text associated with the captured data 
is scanned and separated into individual words. A dictionary of "small words" is applied to 
discard words that do not make meaningful keywords. The resulting collection of keywords 
may be used for multiple purposes, such as suggesting related search terms or routing the 
resulting gem data object after it is created. 

The data is scanned for embedded URLs in step 660. Any embedded URLs found are 
stored and may be used for multiple purposes, such as suggesting related web links or routing 
the resulting gem data object after it is created. 

Interim object packing is performed in step 670. The collection of captured data and 
metadata is packaged into an interim object that may be passed to the main routine for further 
processing. Not all embodiments perform this step. 

Fig. 6 shows only one embodiment of the various types of metadata that may be 
captured by the electronic data-gathering system. Many additional types of metadata may be 
captured and encapsulated along with the original captured data, as will be evident to one of 
skill in the art In one embodiment, metadata is initially captured automatically by the 
electronic data-gathering system. However, a user may also add metadata to a gem data object 
as well. The user of the data-gathering system may also decide not to capture certain types of 
metadata disclosed herein without departing from the scope of the present invention. 

Fig. 7 is a flowchart of a method to create a gem data object from captured data and 
metadata. The steps shown in Fig. 7 encompass step 540 of Fig. 5. The steps of Fig. 7 are 
performed by the main routine. 

In step 710, the analyzer/conduit 30 notifies the main routine of available new data via 
an event notice. This prompts the main routine to retrieve the interim object in step 720. The 
0 main routine also retrieves an empty gem data object from the server module in step 730. The 
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main routine examines the captured data in order to recognize the data type, and applies the 
client module or modules corresponding to that data type to the captured data in step 740. In 
step 750 the main routine populates the fields of the empty gem data object with the data from 
the interim object. 

5 The resulting gem data object is an implementation of the encapsulated original data 

and metadata along with the functionality required to use, view, and manipulate the captured 
data. The gem data object is a single shareable package of data and metadata. This gem data 
object package is a "self-attributing" unit - it contains bibliographic information as well as 
information about how an appropriate bibliography, footnote, or caption should be formatted. 
10 This enables the gem data object to generate its own bibliography or other identifying caption 
about itself. 

Figs. 8, 9, and 10 are diagrams showing an embodiment of a user interface for the 
electronic data-gathering research system. The user interface provides one or more views into 
the research system model and displays information about the gem data objects within the 
25 system. The views also provide a user interface for adding, modifying, or deleting the original 
data or metadata encapsulated by the gem data object. The windows shown in Figs. 8, 9, and 
H) are merely representative of the type of information that may be displayed to describe a 
gem data object 

In Fig. 8, a window 810 shows a "Gem View" window providing a hierarchical view 
20 of the stored gem data objects within the system. The individual gem data objects may be 
organized in a hierarchical manner by grouping them into containers. Each container is a 
grouping of individual gem data objects, or a grouping of containers. This hierarchical storage 
method provides a convenient way for a user to organize the gem data objects within the 
research system. For example, in the window 810, "PTO Website" constitutes a container 
25 812, while "Unfiled" constitutes another container 816. A gem data object "Functions of 
Patent and Trademark Office" 814 is stored in the container 812, while a gem data object 
"Future paper" 818 is stored in the container 816. In this example, gem data object "Functions 
of the Patent and Trademark Office" 814 has a data type of "text". 

A window 820 shows an expanded view of a popup window for the gem data object 
30 "Functions of Patent and Trademark Office" 814. A "General" tab 822 of the popup window 
820 has been selected. The tab 822 displays information regarding the gem data object's name 
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and descriptive notes regarding the gem data object. The tab 822 also delays the container m 
which the gem data object is located, as well as information about the original source 
application for the data contained in the gem data object. Tab 822 provides a user interface for 
viewing, adding, deleting, or modifying the displayed information regarding a gem data 
5 object 

Fig. 9 A contains the same window 810 showing a "Gem View" window providing a 
hierarchical view of the stored gem data objects within the system as shown in Fig. 8. 
However, in Fig. 9A a popup window 920 for the gem data object 814 has a "Bibliography- 
tab 922 selected. The tab 922 displays bibliographic information regarding the original source 
10 data, such as: the author's name; publication; the place published; the edition or volume; and 
the page numbers. The tab 922 also displays the original article title and date of publication, 
as well as the URL of the source document. Tab 922 provides a user interface for adding, 
deleting, or modifying the displayed information regarding a gem data object. 

Fig. 9B shows a second view of the popup window 920 with the "Bibliography" tab 
15 922 selected. In Fig. 9B, a scrolling menu 924 is displayed for the "Bibliography Format- 
option. The scrolling menu 924 allows a user to select from various different types of standard 
bibliographic formats for citing printed and electronic resources. The bibliography format 
field of the gem data object contains an integer value code representing the chosen type of 
bibliography format. 

20 The bibliography entry itself is composed "on-the-fly" when requested by the user. 

The user can request the insertion of a bibliographic entry, footnote, or other type of caption 
by using drag and drop or copy and paste interactions. Once requested, a method of the gem 
data object is invoked wherein the format code is evaluated. The data fields corresponding to 
source information (such as author, publication, etc.) are then strung together using the 
25 appropriate formatting (such as italics, quotation marks, etc.) and made available to the target 
application via the system clipboard. 

Fig. 10 contains the same window 810 showing a "Gem View" window providing a 
hierarchical view of the stored gem data objects within the system as shown in Figs. 8 and 9A. 
However, in Fig. 10 a popup window 1020 for the gem data object 814 has a "View" tab 1022 
30 selected. The tab 1022 provides a view of the original captured data. Tab 1022 also provides 
a user interface for adding, deleting, or modifying the data contained in the gem data object. 
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In the example shown, the user can perform standard editing operations such as changing the 
font, cutting and pasting, justifying the text, etc. 

C. Storage Phase 

After a gem data object is created in the gathering phase, it is stored so that it can be 
5 used in future research. Multiple different types of processes may be performed using the 
accumulated stored data. For example, individual gem data objects may be retrieved, the 
storage database can be reordered or reorganized, and the storage database can be searched. 
The actual type of persistent storage system used is masked from the underlying workings of 
the system by the server component. The role of persistent storage may thus be implemented 
w in any of a number of commercially available database systems, as will be evident to one of 
skill in the art. 

Fig. 1 1 is a flowchart of a storage phase for a research workflow method. Fig. 1 1 
relates to the initial storage of a new gem data object through interaction between the client 
module 40, the server module 70, and the persistent storage module 80 as shown in Fig. 1 . 
15 Various different types of client/server architecture may be used in the research workflow 
method of the present invention. For example, in a single-user version of the system, the 

i: 

server module could be local to the client module. Alternatively, a more distributed 
client/server system could be used wherein the server module is remote from the client 
module. Although specific examples of client/server architectures will be presented herein, 
20 the present invention encompasses all of the various types of client/server architectures 
suitable for use with the present invention. 

In the storage phase of Fig. 11, the gem data object is packaged by the client module 
40 for transmission to the server module 70. Optionally, the raw data comprising the gem data 
object may first be compressed in step 11 12. Next, the gem data object is encoded in step 

25 1114, where it is converted into a form suitable for transfer over existing network protocols. 
In one embodiment, extensible Markup Language (XML) is used to encode the gem data 
• object and an XML-RPC (Remote Procedure Call) is used to wrap the gem data object in a 
method call that provides the data and any additional required parameters to the server module 
70. For example, the XML-based implementation would contain a method call requesting that 

30 the gem data object be saved, information about the database in which to save it, and other 
required parameters. 
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The encoded data is then transmitted in step 1120 from the client module 40 to the 
server module 70. In an XML-based embodiment, the client module 40 contacts the server 
module 70 on a well-known port using Hyper Text Transfer Protocol (HTTP) and Java 
Servlets, and a socket is opened. Upon establishing a successful connection, the client module 
40 transmits the XML-RPC method call containing the XML-encoded gem data object. 

The server module 70 then decodes the gem data object in step 1132, and 
decompresses the gem data object in step 1134 if the gem data object was compressed in step 
1112. In an XML-based embodiment, the XML tags are parsed and the transmitted gem data 
object is rebuilt and forwarded for routing, along with passed parameters such as the name of 
the requested storage database. 

In step 11 36, the gem data object is routed to a storage location. The transmitted gem 
data object may already have a pre-assigned database location. If so, this information is 
passed along with the gem data object and used for routing purposes. 

If location or other required information is missing, the server module 70 can apply 
pre-set rules that have been configured on the server module 70 to the gem data object Such 
pre-set rules can be developed for routing based upon any type of data or metadata contained 
within the gem data object. For example, if a destination location within the database is not 
supplied, a rule might exist which routes all data containing a particular phrase to a particular 
location. Such a rule could instruct all gem data objects containing the phrase "trademark" to 
be placed into the "Patent and Trademark Office" container. In another example, if the gem 
data object comes from a source, such as a URL, that is known to provide offensive material, 
the data could be discarded instead of being stored. 

The pre-set rules on the server module may also relate to adding or modifying certain 
metadata associated with gem data objects. Such rules would allow a user to perform large 
amounts of data gathering without having to interact with each individual data set as it is 
collected. For example, a user could create a rule to apply a default bibliography to a gem data 
object for a certain source author. Some examples of pre-set server rules are: 

If <dataobject> contains <keywords: x, y, x> then <locate in> Container O 

If <dataobject.author> contains <text: x, y, z> then <copy bibliography> <from item 

o F> 



18 



WO 00/39713 



j 



PCT/US99/30965 



After the application of any pre-set rules and routing has occurred, the gem data object 
is stored in step 1 140. As noted above, this may be accomplished using a variety of different 
types of persistent storage 80. 

D. Performing Data Actions Phase 

5 There are a variety of data actions that may be performed using the electronic data- 

gathering system. For example, individual gem data objects may be retrieved, modified, 
deleted, or reorganized. Individual gem data objects may also be copied or cross-referenced to 
other locations within the database. A cross-reference is a pointer to a gem data object that is 
located elsewhere within the database. The gem object database may be searched, sorted, or 

10 reorganized. The contents of the gem object database may be displayed, for example in a 
hierarchical format showing the location of gem data objects within different containers. 

Fig. 12 is a flowchart of a performing data actions phase for a research workflow 
method. In one embodiment, these actions involve interactions between the client module 40 
and the server module 70. The client module 40 is responsible for identifying data sets, 
75 performing a first-pass analysis of the data, presenting a user interface to the user, and 
requesting single gem data objects or groups of objects according to search and browse 
interactions with the user. The server module 70 responds to search and browse requests, 
modifies data, processes detailed data analysis, and performs automated data routing where 
applicable. 

20 For example, if a system user drags a gem data object from the data viewer 62 to a 

word processing program, the client module 40 generates a request to the server module 70 to 
retrieve the requested gem data object. The server module 70 responds by providing the gem 
data object or returning an error message. The client module 40, via the system clipboard data 
manager 60, takes the gem data object and evaluates its data. The client module 40 then 

25 makes the data available to the word processing application via the system clipboard. 

Referring to Fig. 12, in step 1210, a request is received from the client module 40. The 
request is decoded in step 1220 so that it can be analyzed and processed. 

The request is checked to see if the instructions are to add a new gem data object to the 
gem object database (step 1230). If yes, the request is then checked to determine if a filing 
30 preference has been included to route the new gem data object within the gem object database 
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(step 1232). If no filing preference is included, the server module 70 checks to see if there are 
pre-set rules which determine the routing of the gem data object (step 1236). The gem data 
object is then routed according to the included filing preferences or pre-set rules (step 1234). 

However, if no filing preference or pre-set rules exist, the new gem data object will be 
5 routed to the "Unfiled" category or gem data object container (step 1238). 

If the request is not to add a new gem data object, the request is checked to see if a data 
update is requested (step 1240). If yes, the server module 70 processes the request to update 
data by either editing or deleting data (step 1242). Examples of a data update request include 
modifying an existing gem data object, or cross-referencing or copying a gem data object to 
10 another location within the gem object database. 

Otherwise, the request is an information-only query, which is processed by the server 
module 70 as shown in step 1250. Examples of an information-only query include expanding 
a node in a hierarchical depiction of the gem object database to view the contents of a 
particular gem data object container, or requesting a report on the database contents. A user 
15 could request a report based upon a search of the gem object database for a particular keyword 
or gem data object field. Some examples are: 

If <dataobject> contains <keywords: x, y, x> then <display listing of dataobject> 

If <dataobject.author> contains <text: x, y, z> then <display dataobject> 

Upon completion of the request, the server module 70 encodes the response in step 
20 1260 and returns the response to the client module 40 in step 1270. As discussed regarding 
the storage of gem data objects, a variety of different implementations may be used to encode 
and process client module 40 requests. One embodiment is an XML-RPC method call, but 
other embodiments will be evident to one of skill in the art. 

Additional tasks may also be performed within the research system. As discussed 
25 above, the server module 70 responds to requests for information from the client module 40 by 
querying the gem object database. Additionally, the server module 70 may perform 
background tasks to aid the user of the electronic data-gathering system in identifying other 
potential sources of research information. A data analysis process is performed by the data 
analysis processor 78 as a background server module 70 task. The purpose of the data analysis 
30 process is to analyze the user of the electronic data-gathering system, as well as the user's 
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topics of interest, to make suggestions for further data gathering which may be forwarded to 
the client module 40. 

The data analysis process performs two types of analysis passes. First, a general 
profile of the user of the electronic data-gathering system is created. Second, specific profiles 
5 are created for one or more topics of interest. If multiple users exist for the gem object 
database, the data analysis process will create individual user-specific profiles, and may 
additionally create profiles for topics of interest that have multiple users. 

Creating a user profile is performed by deriving generalizations about the type of data 
stored by the user. Such an analysis may include, but is not limited to: the naming and depth 

10 of storage containers and sub-containers; the keywords associated with gem data objects; and 
the sources for the gem data objects. The data so obtained may then be compared against 
internally maintained demographics tables to determine the system user's most likely 
occupation, level of computer sophistication, areas of interest, most valued information 
sources, and various other attributes. The data may also be compared against databases 

15 provided by external sources. The data analysis process uses a similar technique to focus in on 
a specific topic of interest within the gem object database. 

In one embodiment, the data analysis process compares the text of the gem data objects 
against stored tables of word roots. This process is most effective for areas of expertise that 
are well defined. For example, medical, legal, and technology professionals share 
20 vocabularies that can be distilled into word roots. The presence of these roots within the 
captured data can reveal primary and secondary occupations or areas of interest. 

The data analysis process then uses the user profile and topical profiles to generate 
additional potential areas of research for the system user. Queries based upon the profiles are 
submitted to Internet search engines to generate Web site addresses of potential interest. 
25 Queries are also submitted to the query processor 76 of the server module 70, which queries 
the local gem object database and any other local files systems for additional gem data objects 
or files of potential interest. The queries may also suggest other users of the system with 
similar interests or research topics, thereby fostering collaboration between users who may not 
know each other. 
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For example, the data analysis process may reveal that a particular user has created a 
gem data object collection with the following hierarchy of containers: 



The U.S. Constitution 

• Articles 

• Amendments 

• Recent decisions 
Legal Research 

• Patents 

o The U.S. PTO 

Personal 

• DVD Players 

• Vacations 

o Bali 
o Jamaica 
o Hawaii 



10 Given this example, the data analysis process would infer that: the system user is a 

lawyer or legal scholar who is currently interested in Constitutional and Patent law; the user 
has a medium to high computer sophistication; the user may be shopping for a new DVD 
player, and the user may take a vacation to a tropical island in the near future. Based upon this 
information, the data analysis process would submit a query such as "tropical island vacation" 
35 to one or more Internet search engines and return the top ten resulting list of sites as a 
suggestion to the system user for future research. The data analysis processor 78 would also 
submit a query to the local query processor 76 to check local files and gem object databases 
for information regarding tropical island vacations. The data analysis process would perform 
similar queries for the other topical areas of research found. 

20 One embodiment of a data analysis process is show in Fig. 13. In step 1300, a pass 

through an data containers is made to search for keywords and phrases. Text labels are 
parsed, and small words are discarded. In step 1310, a data pass is made through all individual 
gem data objects to search for keywords and phrases. The data is parsed and small words are 
again discarded. In step 1320, a count is made of the most used data sources, such as URLs, 

25 local files, and network or system files. Steps 1 300-1 320 may be performed in any order. 

In step 1340, a comparison is made between the information obtained in steps 1300- 
1320 and a stored database of word roots 1330. The stored database of word roots provides an 
association between common data found in steps 1300-1320 and particular professions or 
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fields of study. For example, the root "ombro" is shown in the database of word roots 1 330 as 
being associated with medicine and psychology. 

The comparison of step 1340 generates additional information. A user profile 1350 is 
generated, containing: an occupation and/or topics of research; favored data sources; and 
5 suggested search queries. In step 1360, these suggested types of search queries are submitted 
and the results evaluated. In step 1370, additional gem data objects are created out of the 
results of the search query. Finally, in step 1380 the newly-created gem data objects are 
placed in a container within the user's gem object database. 

E. Examples of Gem Data Objects 

10 Fi gs. 14-17 show one embodiment of a gem data object formatting scheme. This gem 

data object scheme may be used to implement a client module and a server module that 
operate upon the gem data objects and allow users to interact with a gem object database. It 
should be understood that the coding and formatting scheme for implementing the electronic 
data-gathering system is not limited to the embodiments disclosed herein. Other formatting 

15 methods will be evident to one of skill in the art 

The gem data objects in the scheme shown in Figs. 14-17 are organized in a 
hierarchical "tree" structure. Within the tree structure, there is a "root" node that contains the 
tree structure within the gem object database. All members of the tree structure are part of this 
root node. There are also various levels of container objects that represent inner nodes and 
20 empty nodes of the tree structure. Finally, the "leaf nodes of the tree structure are represented 
by individual items, corresponding to individual gem data objects. The example structure of 
Figs. 14-17 is directed towards a data tree that contains information about scary books. 

Fig. 14 shows an example of a root node. A root node may contain only other 
containers. The root created in Fig. 14 is named "Examples", and becomes the root node of a 
25 tree containing examples of scary books. As given in line 1410, the server Universal Resource 
Identifier (URL) for this root node is "gemserver.gemteq.com." This URI allows the client and 
server modules to locate the root node. 

Fig. 1 5 shows an example of a container node. A container node may contain other 
containers or individual gem data objects, or both. The container of Fig. 15 is named 
30 "ScaryBooks." As given in line 1509, the "ScaryBooks" container has a URI of 

j 
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"gemserver.gemteq.com/Examples" This URI indicates the "ScaryBooks" container is 
located in the "Examples" root node. 

Fig. 16 shows another example of a container. The container of Fig. 16 is named 
"StephenKing." This container has a URI of "gemserver.gemteq.com/Examples/ScaryBooks", 
as given in line 1609. This URI indicates the "StephenKing" container is located in the 
"ScaryBooks" container in the "Examples" root node. 

Fig. 17 shows an example of an individual item. The item of Fig. 17 is named 
"Carrie". As given in line 1709, this item has a URI of 

"gemserver.gemteq.conVExamples/ScaryBooks/StephenKing." This URI indicates the "Carrie 
item is located in the "StephenKing" container, in the "ScaryBooks" container, in the 
"Examples" root node. 

A client would use the above example of a tree structure to find and route , gem data 
objects. For example, the client module 40 would parse the URI of the "Carrie" item shown in 
Fig. 17 in the following way. The server module 70 would be indicated as 
"gemserver.gemteq.com". The data tree would be given by the root node "Examples". The 
path would be given by the two container nodes "ScaryBooks/StephenKing". The actual gem 
data object would be the item "Carrie." 

In an XML-based embodiment, using this path information, the client module 40 
would make a RPC to the server module 70 using standard HTTP protocols to request the 
"Carrie" gem data object. The server module 70 would then reply with a standard HTTP 
response that contains the XML-RPC response that in turn contains the "Carrie" gem data 
object requested. The XML portion of the response may be of many different types to 
represent the success or failure of the RPC call. 

The embodiment of a gem data object "Carrie" as shown in Fig. 17 demonstrates that a 
gem data object is an encapsulation of the original electronic data combined with creation, 
attribution, and user-selected metadata. Line 1703 of the "Carrie" gem data object contains 
user-selected comments about the gem data object. Lines 1704-1710 of the "Carrie" gem data 
object contain metadata about the creation of the original gem data object: when the gem data 
object was created (line 1704); who created the gem data object (lines 1705-1708); and when 
o the gem data object was modified (line 1710). Lines 171 1-1722 contain attribution metadata 
as well as information about how a bibliography for the "Carrie" gem data object should be 
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formatted. For example, line ! 711 gives the bibliography format "MLABook", which 
corresponds to one of the bibliography format choices from the popup menu 924 of Fig. 9B. 

The original data for the gem data object "Carrie" is also shown in Fig. 17. Lines 
1729-1731 list the text data that is encapsulated with the metadata listed above. The example 
"Carrie" is a text type of gem data object. However, a variety of other types of gem data 
objects are also available. 

Several different types of gem data objects are shown in Figs. 18-22. The overall 
structure for a gem data object will remain substantially the same regardless of the type of 
original data contained in the gem data object. Examples of data types that may comprise the 
original data have been discussed previously with regard to Figs. 4A and 4B. Each type of 
gern data object will contain the original data encapsulated with creation, attribution, and user- 
selected metadata. This allows for consistency in the use of gem data objects, and also allows 
new types of gem data objects to be easily incorporated into the gem object database. The 
gem data object types shown in Figs. 18-22 are merely examples of specific embodiments. It 
15 will be evident to one of skill in the art that various other types of gem data objects may be 
created, stored, and manipulated within the electronic data-gathering system. 

Figs. 1 8-22 show a user interface window for a gem data object viewer. In each figure, 
the "View" tab of the popup menu for a different gem data object is expanded. This creates a 
view of the original data encapsulated within the gem data object 

20 Fi S- 18 shows in a window 1870 a view of the gem data object "Functions of Patent 

and Trademark Office" which is a text type of gem data object. Fig. 19 shows in a window 
1970 another text type of gem data object "Text of plant patent form." As shown in the 
window 1970, text type gem data objects may contain different text formatting styles as well 
as plain text. Both ASCII and RTF text formats are supported, as well as a variety of word 

25 processing applications' native formats. 

Fig. 20 shows in a window 2070 a view of a file type of gem data object. The file gem 
data object "My briefcase - link" provides a link to the file "plantpatdoc." As shown by this 
file gem data object, it is possible to mark a set of data for research applications not only by 
saving the data itself, but by saving a link to the path and filename of the data which can later 
30 be used to retrieve the data. 
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A related type of gem data object is an OLE object. A OLE type gem data object 
contains: the data required to present the source data embedded in a target document; data 
about the providing application; and either a path to the data file or the raw data itself. 

Fig. 21 shows in a window 2170 a pictorial type of gem data object. The window 2170 
contains a view of the picture file "PTO header picture". Graphical file types, such as JPEG 
and GIF formatted files, may be stored as pictorial gem data objects 

Fig. 22 shows in a window 2270 a view of a URL type of gem data object. The URL 
gem data object "Patent and Trademark Office Home Page" contains a URL that provides an 
Internet link to the home page of the Patent and Trademark Office on the World Wide Web 
(WWW). Such URL type gem data objects may be created where the user does not wish to 
store the entire web page as a separate gem data object. 

File types may be stored in clipboard format or converted from native formats. 
Standard clipboard formats are available for a variety of data types, including: plain text 
(ASCII); rich text format; HTML text; Bitmap Image (BMP); Device Independent Bitmap 
Image (DIB); Metafile Image (WMF); Enhanced Metafile Image (EMF); and Icon Image 
(ICO). Other formats, such as GIF or JPEG, may be converted by the system clipboard or the 
electronic data-gathering system. 

Although the invention has been described in considerable detail with reference to 
certain embodiments, other embodiments are possible. As will be understood by those of skill 
in the art, the invention may be embodied in other specific forms without departing from the 
essential characteristics thereof. For example, different formats may be used to create gem 
data objects. Additionally, different types of client/server architecture may be used to 
implement the electronic data-gathering system. Accordingly, the present invention is 
intended to embrace all such alternatives, modifications and variations as fall within the spirit 
and scope of the appended claims and equivalents. 
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We claim: 

1 . In a computer system including a user interface and a persistent storage mechanism, 
a computer implemented method for encapsulating user-selected sets of electronic data, the 
method comprising: 

acquiring a set of data; and 

populating a plurality of gem data object field(s) with the set of data encapsulated 
with attribution, creation, and user-defined metadata to create a gem data 
object. 

2. The method of claim 1 further comprising: 

packaging the set of data into an interim object by appending a set of attribution, 
creation, and user-defined metadata before creating a gem data object. 



3. The method of claim 1 further comprising: 

storing a gem data object format field representing a proper format for an 
attribution caption for the gem data object; and 
15 formatting attribution and source metadata based upon the format field. 

4. The method of claim 3 wherein the attribution caption is a bibliography. 

5. In a computer system including a user interface and a persistent storage mechanism, 
a computer implemented method for encapsulating a user-selected set of electronic data, the 
method comprising: 

20 receiving a definition of the set of data in electronic format; 

capturing the set of data without opening a new program view within the user 
interface; and 

populating a gem data object field with the captured set of data to create a gem data 
object. 

25 6. The method of claim 5 including: 

packaging the captured set of data into an interim object by appending a set of 
creation and attribution metadata before populating the gem data object 
field. 



27 



WO 00/39713 



PCT/US99/30965 



7. The method of claim 6 wherein the set of metadata includes computer session 
information. 

8. The method of claim 6 wherein the set of metadata includes source information 
regarding the set of data. 

5 9. The method of claim 6 wherein the set of metadata includes bibliographic 

information regarding the set of data. 

10. The method of claim 6 wherein the set of metadata includes a default name and 
description of the set of data. 

1 1. The method of claim 6 wherein the set of metadata includes a set of keywords in 
10 the set of data. 

12. The method of claim 6 wherein the set of metadata includes a Uniform Resource 
Locator from the set of data. 

13. The method of claim 5 including: 

storing the gem data object in a persistent storage mechanism. 

14. The method of claim 5 wherein the set of data in electronic format is defined by a 
user interacting with the system clipboard. 

15. The method of claim 5 wherein the set of data in electronic format is defined by a 
20 user performing drag and drop activity. 

16. The method of claim 5 wherein the set of data in electronic format is defined by a 
user manually typing data into a gem data object user interface. 

25 17. The method of claim 5 wherein the set of data in electronic format is defined by a 

user using a scanner. 

18. The method of claim 5 wherein the gem data object is handled by a server module 
which interacts with a persistent data storage mechanism. 
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19. The method of claim 18 wherein the server module is independent of the persistent 
data storage mechanism implementation or vendor. 

20. The method of claim 18 wherein access to the persistent data storage mechanism is 
controlled by a security system consisting of user and group accounts, and permissions. 

21. The method of claim 18 wherein the server module is inprocess or remote from a 
client module. 

22. The method of claim 21 wherein the server module communicates with the client 
module via in-process communication mechanisms. 

23. The method of claim 21 wherein the server module communicates with the client 
module via standard Internet communication mechanisms. 

24. The method of claim 18 wherein the server module applies definable routing rules 
to the gem data object. 

25. The method of claim 24 wherein the definable routing rules allow the server 
module to determine the location in a database in which the gem data object will be stored. 

26. The method of claim 24 wherein the definable routing rules instruct the server 
module to discard the gem data object if the gem data object contains pre-specified data. 

27. The method of claim 18 wherein the server module includes a data analysis 
process operating on the data stored in the persistent data storage mechanism. 

28. The method of claim 27 wherein the data analysis process may determine a profile 
of a user of either the gem data object or the computer system. 

29. The method of claim 28 wherein the profile identifies the occupation of the user. 
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30. The method of claim 28 wherein the profile identifies a topic of interest of the 

user. 

31. The method of claim 28 wherein the profile is used to suggest a research topic for 
5 a first user of either the gem data object or the computer system. 

32. The method of claim 31 wherein the research topic is an Internet site. 

33. The method of claim 31 wherein the research topic is a local system file. 

10 

34. The method of claim 31 wherein the research topic is a gem data object. 

35. The method of claim 31 wherein the research topic identifies a second user of 
either the gem data object or the computer system. 

15 

36. The method of claim 27 wherein the data analysis process may determine the 
nature or subject of data-gathering done by a user of either a first gem data object or the 
computer system. 

2 o 37. The method of claim 36 wherein the data analysis process suggests a further 

research topic to the user. 

38. The method of claim 36 wherein the data analysis process suggests an Internet site 
to the user. 

25 

39. The method of claim 36 wherein the data analysis process suggests a local file to 
the user. 

40. The method of claim 36 wherein the data analysis process suggests a second gem 
30 data object to the user. 

41. A method for using one or more gem data object(s) in research applications, 
comprising: 
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accessing a stored set of one or more gem data objects, wherein the one or more 
stored gem data object(s) include: 
a set of encapsulated data; and 
a set of creation and attribution metadata; and 
5 manipulating one or more of the gem data object(s) within the stored set. 

42. A computer program product for an electronic data-gathering system, comprising: 
a computer readable medium that stores program code including: 

program code that acquires a set of data; and 
70 program code that populates a plurality of gem data object field(s) with the set 

of data encapsulated with attribution, creation, and user-defined 
metadata to create a gem data object. 

43. The computer program product of claim 42, further comprising: 

program code that packages the set of data into an interim object by appending a 
15 set of attribution, creation, and user-defined metadata before creating a gem 

y. . data object. 

44. The computer program product of claim 42, further comprising: 

program code that stores a gem data object format field representing a proper 
format for an attribution caption for the gem data object; and 
20 program code that formats attribution and source metadata based upon the format 

field. 

45. A computer program product for an electronic data-gathering system, comprising: 
a computer readable medium that stores program code including: 

program code that receives a definition of the set of data in electronic format; 
25 program code that captures the set of data without opening a new program view 

within the user interface; and 
program code that populates a gem data object field with the captured set of 

data to create a gem data object. 
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46 The computer program product of claim 45, further comprising: 

program code that packages the captured set of data r. o an interim object by 
appending a set of creation and attribution metadata before populating the 
gem data object field. 

47. The computer program product of claim 45, wherein the gem data object is 
handled by a server module which interacts with a persistent data storage mechanism. 

48. The computer program product of claim 47, wherein the server module applies 
definable routing rules to the gem data object. 

49 The computer program product of claim 47, wherein the server module includes a 
,0 data analysis process operating on the data stored in the persistent data storage mechanism. 

50. The computer program product of claim 49, wherein the data analysis process may 
determine a profile of a user of either the gem data object or the computer system. 

51 The computer program product of claim 49, wherein the data analysis process may 
determine the nature or subject of data-gathering done by a user of e«her a first gem data 

15 object or the computer system. 

52 A computer readable memory for storing original data packaged with attribution 
and creation metadata for access by an application program being executed on a data 

processing system, comprising: 

a data structure stored in said memory, the data structure including informatton 
20 resident in a database used by the application program and including: 

an encapsulation of the original data; 
a set of user-specific metadata; and 

a set of attribution and creation metadata, wherein the set of attribution and 
creation metadata is automatically added to the original data. 
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53. The computer readable memory of claim 52 wherein the original data is comprised 
of text data. 

54. The computer readable memory of claim 52 wherein the original data is comprised 
of pictorial data. 

55. The computer readable memory of claim 52 wherein the original data is comprised 
of an embedded object formatted in the Object Linking and Embedding standard. 

56. The computer readable memory of claim 52 wherein the original data is comprised 
of a Universal Resource Locator address. 

57. The computer readable memory of claim 52 wherein the original data is comprised 
of text in Hyper Text Markup Language format. 

58. The computer readable memory of claim 52 wherein the original data is comprised 
of the path and filename of the original source. 

r 59. The computer readable memory of claim 52 wherein the data structure contains 
programming code allowing an attribution caption to be automatically created and formatted 
for the data structure. 

60. The computer readable memory of claim 59 wherein the attribution caption is a 
bibliography. 

6L A computer data signal for storing original data packaged with metadata 
embodied in a transmission medium comprising: 
an encapsulation of the original data; 
a set of user specific metadata; and 

a set of attribution and creation metadata, wherein the set of attribution and 
creation metadata is automatically added to the original data. 
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1401 <Root vers="1 .0" id="21 8838" name="Examples" mime="appiication/x-gemteq-roof > 

1402 <rootName>Examples for example</rootName> 

1 403 <comments>Various things to play with that show a gem server in action 

1404 </comments> 

1 405 <creationDate>1 1 Aug 1 999 14: 1 8 </creationDate> 

1406 <creator> 

1 407 <creatorName> James Bondo</creatorName> 

1408 <creatorlD>jbondo</creatorlD> 

1409 </creator> 

1410 <server>gemserver.gemteq.com</server> 

1 41 1 <modificationDate> 1 3 Aug 1 999 1 2:03</modificationDate> 

1412 <defaultBib 

1413 format=7/bibserver.gemteq.com/BiblographyFonriats/GenericBook , '/> 

1414 <objList> 

141 5 <!~ all are mime: application/x-gemteq-chest by definition -!> 
1416 

1417 <obj id="21 0243234" name="ScaryBooks" > 

1418 <containerName>Scary Books</containerName> 

1419 </obj> 

1420 <obj id="2832834" name= n Various n /> 

1421 <containerName>Various</containerName> 

1422 </obj> 

1423 <obj id="21 98348" name="SallyHome" /> 

1 424 <containerName>Sally's Home Chest</containerName> 

1425 </obj> 

1426 </objList> 

1427 </Root> 
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1501 <cor^nerve re =-1.0^ 

1501 ^^"^^^e^e scary Books</conta.nerName> 

1 503 <comments>A collection of scary books</comrnents> 

] 504 < C reationDate>1 1 Aug 1999 14:1 9 </creationDate> 

1506 <Creat0r <creatorName> James Bondo</creatorName> 

1 507 <creatorlD>jbondo</creatorlD> 

IS 3 <Ob <o^id=-210243234- name="StephanKing" m ime=»application/x-gernteq-tray-> 

] 5U <itemName> Misc </itemName> 

i32 <oW W="218388848" name="DeanKcontz« rnirne=-appr.cation/x-gernteq-tray"> 

]l£ <itemName>Dean Koontz</itemName> 

1518 </ob i > 

1519 

1520 

1521 </ob)List> 

1522 </Container> 
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Container 
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1601 <Container vers="1.0" id="21 0243234" name="StephanKing" mime="appiication/x-gemteq-tray"> 



1602 <containerName>Stephan King</containerName> 

1 603 <comments>A collection of books by Stephan King</comments> 

1604 <creationDate>1 1 Aug 1999 14:19 </creationDate> 

1605 <creator> 

1606 <creatorName> James Bondo</creatorName> 

1 607 <creatorlD>jbondo</creatorlD> 

1608 </creator> 

1 609 <container>//gemserver.gemteq.com/Examples/ScaryBooks</container> 

1610 <modificationDate>13 Aug 1999 12:03 </modificationDate> 

161 1 <defaultBib format=7/bibserver.gemteq.com/BibilographyFormats/GenericBook" /> 

1612 <objList> 

1 61 3 <obj id="1 20983838" name="Carrie" mime="application/x-gemteq-textgem"> 

1614 <itemName>Carrie</itemName> 

1615 </obj> 

1 61 6 <obj id="65683838" name=TheStand" mime="application/x-gernteq-textgem ,, > 

1617 <itemName>The Stand</itemName> 

1618 </obj> 
1619 

1620 <obj id="777983838" name="IF mime="application/x-gernteq-textgern"> 

1621 <itemName>IT</itemName> 

1622 </obj> 

1 623 <obj id="76883838" name=" PetSeminary " mime="application/x-gemteq-textgem"> 

1624 <itemName>Pet Seminary</itemName> 

1625 </obj> 

1 626 <obj id="1 20983838" name=TheLangoliers" mime="application/x-gemteq-textgem"> 

1 627 <itemName> The Langoliers </itemName> 

1628 </obj> 

1 629 <obj id="298948393" name="MiscNotes" mime="application/x-gemteq-tray"> 

1630 <itemName>Misc Notes</itemName> 

1631 </obj> 

1 632 <obj id=°298948393" name="Pic" mime="application/x-gemteq-imagegem"> 

1 633 <itemName> Picture of Stephan King</itemName> 

1634 </obj> 
1635 

1636 </objl_ist> 
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Item 
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1701 < l ttmvers=M.0-id=-120983838-name=-Carrie- m ime=-applicaSon/x-ger nte q-textgem-> 

1702 <jtemName>Carrie</itemName> 

703 <comments>Psi and teenage angst. </comments> 

!704 <creationDate> 1 2 Aug 1999 1 5:20 </creatonDate> 

1706 <CreatD ^creatorName>Billy Joe Jim Bob</creatorName> 

1707 < C reatorlD>bjjbob</ceatorlD> 

1709 ! ^"!L n n a tP> 14 Aua 1999 16:20 </modificat»onDate> 

1719 <author>Stephan King</author> 

<pubiicationTitle>Carrie</publicationTitle> 

'4 < pu biisher>Random House</publisher> 

1 71 5 <| 0 cation>New York</location> 

1716 <ed>3</ed> 

1717 <vol>1 </vol> 

718 <date>2Aug1978</date> 

1719 <pageStart>21</pageStart> 

1799 </bib^ 

^23 <!_ an image may have the following attributes 

17 24 datamime="image/jpg B 

172 5 compression="application/x-zip" 

172 6 encoding="application/uuencode" 

1731 of goo upon her at the prom. 

1732 </data> 

1733 </ltem 
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